Crate finl_unicode
finl_unicode is a crate to provide Unicode support for the finl project. It is not necessarily meant to provide comprehensive Unicode support, although I will consider adding additional use cases as necessary. Unicode 14.0.0 is implemented in the current version.
Two features are currently supported:
- Unicode segmentation. (Specify clusters as a feature when importing the crate.) For a peekable iterator of CharIndices, we extend that iterator to include a next_cluster method which returns Option<String>, which will contain the next grapheme cluster if there is one or None if there isn’t.
- Character category. (Specify categories as a feature when importing the crate.) Extends the char type with methods for testing the category of the character.
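The extension-trait pattern behind the character category feature can be sketched as follows. This is a minimal illustration, not the crate's API: the names CoarseClass, CoarseCategory, coarse_class, and is_coarse_letter are hypothetical, and the standard-library predicates used here only approximate the Unicode general categories.

```rust
/// Hypothetical coarse classes for illustration; NOT the crate's actual enums.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum CoarseClass {
    Letter,
    Number,
    Whitespace,
    Other,
}

/// Hypothetical extension trait; the trait and method names are assumptions.
pub trait CoarseCategory {
    /// Retrieve a coarse class for the character.
    fn coarse_class(self) -> CoarseClass;
    /// Test whether the character falls in the coarse letter class.
    fn is_coarse_letter(self) -> bool;
}

impl CoarseCategory for char {
    fn coarse_class(self) -> CoarseClass {
        // The std predicates test Unicode properties (Alphabetic, Numeric,
        // White_Space) that only approximate the general categories.
        if self.is_alphabetic() {
            CoarseClass::Letter
        } else if self.is_numeric() {
            CoarseClass::Number
        } else if self.is_whitespace() {
            CoarseClass::Whitespace
        } else {
            CoarseClass::Other
        }
    }

    fn is_coarse_letter(self) -> bool {
        self.coarse_class() == CoarseClass::Letter
    }
}
```

With the trait in scope, the methods are called directly on a char value, for example 'é'.is_coarse_letter(). A table-driven implementation built from the Unicode data files would replace the std predicates in a real category lookup.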
The default is to compile all features. Note that the Rust compiler/linker will not automatically link unused code, so most of the time there will be no need to remove features.
Building the crate runs a build script which connects to unicode.org to download the data files.
Modules
The code in this module provides a trait, implemented for char, that allows testing or retrieving the Unicode category for the character, as well as two enums for identifying character classes.
This module provides two interfaces for accessing clusters from an underlying string. The GraphemeCluster trait extends the Peekable iterators over Chars or CharIndices to add a next_cluster method which returns Option<String> with the next cluster if one exists. This is the best method for getting individual clusters from a stream which is normally only getting chars, but it is not recommended if you wish to iterate over clusters.
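To show how a next_cluster method can sit on top of a peekable character iterator, here is a minimal sketch. It is an illustration under stated assumptions, not the crate's implementation: the trait name NextClusterSketch and the helper is_combining_mark are hypothetical, only Peekable<CharIndices> is covered, and the cluster rule is deliberately simplified to "a base character followed by any characters in U+0300 to U+036F" rather than the full Unicode segmentation rules.

```rust
use std::iter::Peekable;
use std::str::CharIndices;

/// Hypothetical trait for illustration; the name is an assumption.
pub trait NextClusterSketch {
    /// Return the next (simplified) cluster, or None at end of input.
    fn next_cluster(&mut self) -> Option<String>;
}

/// Simplified rule (an assumption): only the Combining Diacritical Marks
/// block, U+0300..=U+036F, extends a cluster.
pub fn is_combining_mark(c: char) -> bool {
    ('\u{0300}'..='\u{036F}').contains(&c)
}

impl<'a> NextClusterSketch for Peekable<CharIndices<'a>> {
    fn next_cluster(&mut self) -> Option<String> {
        // Take the base character, or report that the input is exhausted.
        let (_, base) = self.next()?;
        let mut cluster = String::from(base);
        // Peek before consuming so the first character of the following
        // cluster stays in the iterator.
        while let Some(&(_, c)) = self.peek() {
            if !is_combining_mark(c) {
                break;
            }
            cluster.push(c);
            self.next();
        }
        Some(cluster)
    }
}
```

Because the method lives on the same iterator that yields (index, char) pairs, a caller can interleave ordinary next() calls with next_cluster() calls. For instance, on "e\u{0301}a" a call to next_cluster() yields the two-character cluster and a following next() yields the plain 'a'.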